OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Linear-quadratic blind source separating structure for removing show-through in scanned documents

Internal identifier: 000557 (Main/Exploration); previous: 000556; next: 000558

Authors: Farnood Merrikh-Bayat [Iran]; Massoud Babaie-Zadeh [France]; Christian Jutten [France]

RBID : Pascal:12-0083116

Abstract

Digital documents are usually degraded during the scanning process due to the contents of the backside of the scanned manuscript. This is often caused by the show-through effect, i.e. the backside image that interferes with the main front side picture due to the intrinsic transparency of the paper. This phenomenon is one of the degradations that one would like to remove especially in the field of Optical Character Recognition (OCR) or document digitalization which require denoised texts as inputs. In this paper, we first propose a novel and general nonlinear model for canceling the show-through phenomenon. A nonlinear blind source separation algorithm is used for this purpose based on a new recursive and extendible structure. However, the results are restricted due to a blurring effect that appears during the scanning process due to the light transfer function of the paper. Consequently, for improving the results, we introduce a refined separating architecture for simultaneously removing the show-through and blurring effects.
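The abstract does not state the model itself, so the following is only a minimal sketch of what a linear-quadratic mixture and a recursive separating structure can look like for two pixel values (front side and back side). The coefficient values, the function names `mix` and `separate`, and the exact form of the mixture are assumptions made for illustration; they are not taken from the paper. The coefficients are also treated as known here, whereas a blind method would have to estimate them from the scans.

```python
import random

# Hypothetical mixing coefficients (assumed, not taken from the paper).
A12, B1 = 0.3, 0.1  # linear and quadratic leakage into the front-side scan
A21, B2 = 0.2, 0.1  # linear and quadratic leakage into the back-side scan


def mix(s1, s2):
    """Linear-quadratic mixture of two source pixel values."""
    x1 = s1 + A12 * s2 + B1 * s1 * s2
    x2 = s2 + A21 * s1 + B2 * s1 * s2
    return x1, x2


def separate(x1, x2, iterations=60):
    """Recursive inversion: feed the current estimates back until they settle."""
    y1, y2 = x1, x2  # start from the observed (mixed) values
    for _ in range(iterations):
        # Both estimates are updated simultaneously from the previous pair.
        y1, y2 = (x1 - A12 * y2 - B1 * y1 * y2,
                  x2 - A21 * y1 - B2 * y1 * y2)
    return y1, y2


random.seed(0)
sources = [(random.random(), random.random()) for _ in range(5)]
recovered = [separate(*mix(s1, s2)) for s1, s2 in sources]
max_error = max(abs(r - s)
                for rec, src in zip(recovered, sources)
                for r, s in zip(rec, src))
print(max_error)
```

With small coefficients and pixel values in [0, 1], the feedback loop is a contraction, so the estimates converge to the original sources. This sketch ignores the blurring effect mentioned in the abstract.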

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Linear-quadratic blind source separating structure for removing show-through in scanned documents</title>
<author>
<name sortKey="Merrikh Bayat, Farnood" sort="Merrikh Bayat, Farnood" uniqKey="Merrikh Bayat F" first="Farnood" last="Merrikh-Bayat">Farnood Merrikh-Bayat</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Electrical engineering department, Sharif university of technology, Azadi Avenue</s1>
<s2>Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Iran</country>
<wicri:noRegion>Tehran</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Babaie Zadeh, Massoud" sort="Babaie Zadeh, Massoud" uniqKey="Babaie Zadeh M" first="Massoud" last="Babaie-Zadeh">Massoud Babaie-Zadeh</name>
<affiliation wicri:level="1">
<inist:fA14 i1="02">
<s1>GIPSA-lab, Department of Images and Signal</s1>
<s2>Grenoble</s2>
<s3>FRA</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<settlement type="city">Grenoble</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Jutten, Christian" sort="Jutten, Christian" uniqKey="Jutten C" first="Christian" last="Jutten">Christian Jutten</name>
<affiliation wicri:level="1">
<inist:fA14 i1="03">
<s1>Institut Universitaire de France</s1>
<s2>Paris</s2>
<s3>FRA</s3>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<settlement type="city">Paris</settlement>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">12-0083116</idno>
<date when="2011">2011</date>
<idno type="stanalyst">PASCAL 12-0083116 INIST</idno>
<idno type="RBID">Pascal:12-0083116</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000105</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000667</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000111</idno>
<idno type="wicri:doubleKey">1433-2833:2011:Merrikh Bayat F:linear:quadratic:blind</idno>
<idno type="wicri:Area/Main/Merge">000563</idno>
<idno type="wicri:source">HAL</idno>
<idno type="RBID">Hal:hal-00643471</idno>
<idno type="url">https://hal.archives-ouvertes.fr/hal-00643471</idno>
<idno type="wicri:Area/Hal/Corpus">000078</idno>
<idno type="wicri:Area/Hal/Curation">000078</idno>
<idno type="wicri:Area/Hal/Checkpoint">000079</idno>
<idno type="wicri:doubleKey">1433-2833:2011:Merrikh Bayat F:linear:quadratic:blind</idno>
<idno type="wicri:Area/Main/Merge">000335</idno>
<idno type="wicri:Area/Main/Curation">000557</idno>
<idno type="wicri:Area/Main/Exploration">000557</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Linear-quadratic blind source separating structure for removing show-through in scanned documents</title>
<author>
<name sortKey="Merrikh Bayat, Farnood" sort="Merrikh Bayat, Farnood" uniqKey="Merrikh Bayat F" first="Farnood" last="Merrikh-Bayat">Farnood Merrikh-Bayat</name>
<affiliation wicri:level="1">
<inist:fA14 i1="01">
<s1>Electrical engineering department, Sharif university of technology, Azadi Avenue</s1>
<s2>Tehran</s2>
<s3>IRN</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Iran</country>
<wicri:noRegion>Tehran</wicri:noRegion>
</affiliation>
</author>
<author>
<name sortKey="Babaie Zadeh, Massoud" sort="Babaie Zadeh, Massoud" uniqKey="Babaie Zadeh M" first="Massoud" last="Babaie-Zadeh">Massoud Babaie-Zadeh</name>
<affiliation wicri:level="1">
<inist:fA14 i1="02">
<s1>GIPSA-lab, Department of Images and Signal</s1>
<s2>Grenoble</s2>
<s3>FRA</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<settlement type="city">Grenoble</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Jutten, Christian" sort="Jutten, Christian" uniqKey="Jutten C" first="Christian" last="Jutten">Christian Jutten</name>
<affiliation wicri:level="1">
<inist:fA14 i1="03">
<s1>Institut Universitaire de France</s1>
<s2>Paris</s2>
<s3>FRA</s3>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName>
<settlement type="city">Paris</settlement>
</placeName>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
<imprint>
<date when="2011">2011</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Blind</term>
<term>Blind separation</term>
<term>Character recognition</term>
<term>Linear source</term>
<term>Non linear model</term>
<term>Optical character recognition</term>
<term>Source separation</term>
<term>Text</term>
<term>Transfer function</term>
<term>Transparency</term>
<term>blurring image</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Reconnaissance optique caractère</term>
<term>Reconnaissance caractère</term>
<term>Texte</term>
<term>Floutage</term>
<term>Source linéaire</term>
<term>Aveugle</term>
<term>Transparence</term>
<term>Séparation aveugle</term>
<term>Séparation source</term>
<term>Modèle non linéaire</term>
<term>Fonction transfert</term>
</keywords>
<keywords scheme="mix" xml:lang="en">
<term>Blind source separation</term>
<term>Blurring effect</term>
<term>Linear-quadratic</term>
<term>Show-through reduction</term>
<term>convolutive mixtures</term>
<term>nonlinear mixtures</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Digital documents are usually degraded during the scanning process due to the contents of the backside of the scanned manuscript. This is often caused by the show-through effect, i.e. the backside image that interferes with the main front side picture due to the intrinsic transparency of the paper. This phenomenon is one of the degradations that one would like to remove especially in the field of Optical Character Recognition (OCR) or document digitalization which require denoised texts as inputs. In this paper, we first propose a novel and general nonlinear model for canceling the show-through phenomenon. A nonlinear blind source separation algorithm is used for this purpose based on a new recursive and extendible structure. However, the results are restricted due to a blurring effect that appears during the scanning process due to the light transfer function of the paper. Consequently, for improving the results, we introduce a refined separating architecture for simultaneously removing the show-through and blurring effects.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>France</li>
<li>Iran</li>
</country>
<settlement>
<li>Grenoble</li>
<li>Paris</li>
</settlement>
</list>
<tree>
<country name="Iran">
<noRegion>
<name sortKey="Merrikh Bayat, Farnood" sort="Merrikh Bayat, Farnood" uniqKey="Merrikh Bayat F" first="Farnood" last="Merrikh-Bayat">Farnood Merrikh-Bayat</name>
</noRegion>
</country>
<country name="France">
<noRegion>
<name sortKey="Babaie Zadeh, Massoud" sort="Babaie Zadeh, Massoud" uniqKey="Babaie Zadeh M" first="Massoud" last="Babaie-Zadeh">Massoud Babaie-Zadeh</name>
</noRegion>
<name sortKey="Jutten, Christian" sort="Jutten, Christian" uniqKey="Jutten C" first="Christian" last="Jutten">Christian Jutten</name>
</country>
</tree>
</affiliations>
</record>
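
As a rough illustration of how such a record could be read programmatically, the sketch below extracts the title and the author/country pairs with Python's standard `xml.etree.ElementTree`. The `SAMPLE` string is a trimmed, hypothetical excerpt of the record above: it leaves out the `wicri:` and `inist:` prefixed elements and attributes, because the record as displayed does not declare those namespace prefixes and a strict XML parser would reject them. The function name `summarize` is likewise an assumption.

```python
import xml.etree.ElementTree as ET

# Trimmed excerpt of the record; prefixed (wicri:/inist:) parts are omitted.
SAMPLE = """<record><TEI><teiHeader><fileDesc><titleStmt>
<title xml:lang="en" level="a">Linear-quadratic blind source separating structure for removing show-through in scanned documents</title>
<author>
<name first="Farnood" last="Merrikh-Bayat">Farnood Merrikh-Bayat</name>
<affiliation><country>Iran</country></affiliation>
</author>
<author>
<name first="Massoud" last="Babaie-Zadeh">Massoud Babaie-Zadeh</name>
<affiliation><country>France</country></affiliation>
</author>
<author>
<name first="Christian" last="Jutten">Christian Jutten</name>
<affiliation><country>France</country></affiliation>
</author>
</titleStmt></fileDesc></teiHeader></TEI></record>"""


def summarize(xml_text):
    """Return the title and a list of (author name, country) pairs."""
    root = ET.fromstring(xml_text)
    stmt = root.find(".//titleStmt")
    title = stmt.findtext("title")
    authors = [(a.findtext("name"), a.findtext("affiliation/country"))
               for a in stmt.findall("author")]
    return title, authors


title, authors = summarize(SAMPLE)
print(title)
for name, country in authors:
    print(f"{name} [{country}]")
```

To parse the full record, the undeclared prefixes would first need namespace declarations, for example on a wrapping root element.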

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000557 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000557 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:12-0083116
   |texte=   Linear-quadratic blind source separating structure for removing show-through in scanned documents
}}
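
If links to several records are needed, the template can be filled in programmatically. The following is a small sketch; the helper name `explor_link` is hypothetical, the parameter names are copied from the template above, and it is an assumption that the wiki accepts a single space after each `=` instead of the aligned spacing shown.

```python
def explor_link(wiki, area, flux, etape, type_, cle, texte):
    """Build an 'Explor lien' template call from the given field values."""
    fields = [
        ("wiki", wiki),
        ("area", area),
        ("flux", flux),
        ("étape", etape),
        ("type", type_),
        ("clé", cle),
        ("texte", texte),
    ]
    lines = ["{{Explor lien"]
    lines += [f"   |{name}= {value}" for name, value in fields]
    lines.append("}}")
    return "\n".join(lines)


link = explor_link(
    wiki="Ticri/CIDE",
    area="OcrV1",
    flux="Main",
    etape="Exploration",
    type_="RBID",
    cle="Pascal:12-0083116",
    texte="Linear-quadratic blind source separating structure "
          "for removing show-through in scanned documents",
)
print(link)
```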

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024